Quality Analysis of Macroprosodic F0 Dynamics in Text-to-Speech Signals

نویسندگان

  • Christoph Norrenbrock
  • Florian Hinterleitner
  • Ulrich Heute
  • Sebastian Möller
چکیده

We present a study on the relation between fundamental frequency (F0) and its perceptual effect in the context of text-tospeech (TTS) synthesis. Features that essentially capture the intonational (macro-prosodic) properties of spoken speech are introduced and analysed with regard to the following questions: (i) How does the prosodic variation of TTS signals differ from natural speech? (ii) Is there a functional relationship between the prosodic variation of TTS signals and its perceived quality? In answering these questions we present novel approaches for the construction of non-intrusive quality estimators. The results reveal a substantial degree of systematic influence of prosodic variation on TTS quality.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

High-quality prosodic modification of speech signals

The aim of this work was to develop a procedure that allows prosodic modiications of speech signals without impairing the quality. The developed procedure is based on the Fourier analysis/synthesis technique with several improvements on the analysis side, such as the analysis of signals with rapidly changing F0 and the analysis of weak spectral components. Also for the modiication of the short-...

متن کامل

Stylisation and Symbolic Coding of F0 : a Quantitative Model

This paper presents a reversible model for the stylisation and the symbolic coding of macroprosodic fundamental frequency patterns. Prosodic labels are generated automatically from the speech signal and can be used to regenerate a synthetic F0 curve which is as close as possible to the original curve. The model has been tested successfully for 20 speakers in French and Italian.

متن کامل

Word segmentation in Persian continuous speech using F0 contour

Word segmentation in continuous speech is a complex cognitive process. Previous research on spoken word segmentation has revealed that in fixed-stress languages, listeners use acoustic cues to stress to de-segment speech into words. It has been further assumed that stress in non-final or non-initial position hinders the demarcative function of this prosodic factor. In Persian, stress is retract...

متن کامل

A New Approach to Detect Congestive Heart Failure Using Symbolic Dynamics Analysis of Electrocardiogram Signal

The aim of this study is to show that the measures derived from Electrocardiogram (ECG) signals many a time perform better than the same measures obtained from heart rate (HR) signals. A comparison was made to investigate how far the nonlinear symbolic dynamics approach helps to characterize the nonlinear properties of ECG signals and HR signals, and thereby discriminate between normal and cong...

متن کامل

Using Noisy Speech to Study the Robustness of a Continuous F0 Modelling Method in HMM-based Speech Synthesis

In parametric text-to-speech synthesis using Hidden Markov Model (HMM), the fundamental frequency (F0) parameter modelling is important because it has a direct effect on the prosody of synthetic speech. F0 is typically modelled by a discrete distribution for unvoiced speech and a continuous distribution for voiced, by using a multi-space distribution (MSD). However, F0 modelling using MSD-HMM i...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2012